13 research outputs found

    Medical image retrieval and automatic annotation: VPA-SABANCI at ImageCLEF 2009

    Advances in medical imaging technology have led to an exponential growth in the number of digital images that need to be acquired, analyzed, classified, stored and retrieved in medical centers. As a result, medical image classification and retrieval have recently gained high interest in the scientific community. Despite several attempts, such as the yearly ImageCLEF Medical Image Annotation Competition, the proposed solutions are still far from being sufficiently accurate for real-life implementations. In this paper we summarize the technical details of our experiments for the ImageCLEF 2009 medical image annotation task. We use one direct and two hierarchical classification schemes that employ support vector machines (SVMs) and local binary patterns, which are recently developed low-cost texture descriptors. The direct scheme employs a single SVM to automatically annotate X-ray images. The two proposed hierarchical schemes divide the classification task into sub-problems. The first hierarchical scheme exploits ensemble SVMs trained on IRMA sub-codes. The second learns from subgroups of data defined by the frequency of classes. Our experiments show that hierarchical annotation of images by training individual SVMs over each IRMA sub-code outperforms the other schemes in annotation accuracy, at the cost of increased processing time relative to the direct scheme.
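
The local binary pattern descriptor mentioned in this abstract can be illustrated with a minimal sketch. It assumes the basic 8-neighbour, radius-1 variant on a grayscale image stored as a list of rows; the abstract does not say which variant or parameters the authors used, and the function names `lbp_codes` and `lbp_histogram` are hypothetical.

```python
def lbp_codes(image):
    """Return the 8-neighbour LBP code of every interior pixel, row by row.

    Assumption: basic radius-1 LBP; the paper's exact variant is not stated.
    """
    # Neighbour offsets, clockwise from the top-left pixel; offset k sets bit k.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    rows, cols = len(image), len(image[0])
    codes = []
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            center = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # A neighbour at least as bright as the centre contributes a 1.
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit
            codes.append(code)
    return codes


def lbp_histogram(image):
    """Normalised 256-bin histogram of LBP codes, usable as a texture feature."""
    codes = lbp_codes(image)
    hist = [0.0] * 256
    if not codes:  # image too small to have interior pixels
        return hist
    for code in codes:
        hist[code] += 1.0
    return [count / len(codes) for count in hist]
```

A histogram of this kind is the sort of fixed-length feature vector that could be passed to an SVM. How the authors built their features from the patterns is not described here.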

    Application of statistical learning and continuous optimization methods to computational statistics using infinite and semi-infinite programming.

    Machine learning (ML), a subfield of artificial intelligence, is concerned with the development of algorithms that allow computers to “learn”. ML is the process of training a system with a large number of examples, extracting rules and finding patterns in order to make predictions on new data points (examples). The most common machine learning schemes are supervised, semi-supervised, unsupervised and reinforcement learning. These schemes apply to natural language processing, search engines, medical diagnosis, bioinformatics, credit fraud detection, stock market analysis, classification of DNA sequences, and speech and handwriting recognition in computer vision, to name just a few. In this thesis, we focus on Support Vector Machines (SVMs), which are among the most powerful methods currently available in machine learning. As a first motivation, we develop a model selection tool integrated into the SVM in order to solve a particular problem of computational biology: the prediction of eukaryotic pro-peptide cleavage sites, applied to real data collected from the NCBI data bank. Based on this biological example, the model selection method is then generalized to all kinds of learning problems. In ML algorithms, one of the crucial issues is the representation of the data. Discrete geometric structures and, especially, linear separability of the data play an important role in ML. If the data are not linearly separable, a kernel function implicitly maps them into a higher-dimensional space in which they become linearly separable. As the data become heterogeneous and large-scale, single-kernel methods become insufficient to classify nonlinear data. Convex combinations of kernels were developed to classify this kind of data [8]. Nevertheless, such combinations are limited to a finite choice of kernels.
To overcome this limitation, we propose a novel method of “infinite” kernel combinations for learning problems with the help of infinite and semi-infinite programming regarding all elements in kernel space. This makes it possible to study variations of combinations of kernels when considering heterogeneous data in real-world applications. Combination of kernels can be done, e.g., along a homotopy parameter or a more specific parameter. Looking at all infinitesimally fine convex combinations of the kernels from the infinite kernel set, the margin is maximized subject to an infinite number of constraints with a compact index set and an additional (Riemann-Stieltjes) integral constraint due to the combinations. After a parametrization in the space of probability measures, the problem becomes semi-infinite. We analyze the regularity conditions which satisfy the Reduction Ansatz and discuss the type of distribution functions within the structure of the constraints and our bilevel optimization problem. Finally, we adapt well-known numerical methods of semi-infinite programming to our new kernel machine. We improve the discretization method for our specific model and propose two new algorithms. We prove the convergence of the numerical methods and analyze the conditions and assumptions of these convergence theorems. Ph.D. - Doctoral Progra
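
The idea of a convex combination of kernels, and of approximating a continuous family of kernels on a discretization grid, can be sketched as follows. This is an illustration under stated assumptions and not the thesis's algorithm: the Gaussian kernel family, its width parameter, the uniform grid and the fixed weights are all chosen here for the example, and the helper names are hypothetical. In the method described above the weights would be optimized rather than fixed.

```python
import math


def gaussian_kernel(x, z, width):
    """Gaussian kernel value for two points; `width` plays the role of the kernel parameter."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * width ** 2))


def uniform_grid(low, high, count):
    """Discretise the parameter interval [low, high] into `count` equally spaced points."""
    if count < 2:
        raise ValueError("count must be at least 2")
    step = (high - low) / (count - 1)
    return [low + k * step for k in range(count)]


def combined_gram(points, widths, weights):
    """Gram matrix of the convex combination sum_k weights[k] * k(., .; widths[k])."""
    if len(widths) != len(weights):
        raise ValueError("widths and weights must have the same length")
    # Convexity: weights are non-negative and sum to one.
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    n = len(points)
    return [
        [
            sum(b * gaussian_kernel(points[i], points[j], w)
                for b, w in zip(weights, widths))
            for j in range(n)
        ]
        for i in range(n)
    ]


# Usage: four kernel widths on a grid over [0.5, 2.0], equally weighted.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
widths = uniform_grid(0.5, 2.0, 4)
gram = combined_gram(points, widths, [0.25] * 4)
```

Refining the grid and letting the weights approximate a probability measure over the parameter interval is the intuition behind moving from finitely many kernels to an infinite family.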

    Infinite kernel learning via infinite and semi-infinite programming

    As data become heterogeneous, multiple kernel learning methods may help to classify them. To overcome the drawback of being restricted to a finite choice of kernels, we propose a novel method of 'infinite' kernel combinations for learning problems with the help of infinite and semi-infinite optimization. Looking at all the infinitesimally fine convex combinations of the kernels from an infinite kernel set, the margin is maximized subject to an infinite number of constraints with a compact index set and an additional (Riemann-Stieltjes) integral constraint due to the combinations. After a parametrization in the space of probability measures, we get a semi-infinite programming problem. We analyse regularity conditions (reduction ansatz) and discuss the type of density functions in the constraints and the bilevel optimization problem derived. Our proposed approach is implemented with the conceptual reduction method and tested on homogeneous and heterogeneous data; it yields better accuracy than single-kernel learning for the heterogeneous data. We analyse the structure of the problems obtained and discuss structural frontiers, trade-offs and research challenges.
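
For orientation, the kind of problem described in this abstract can be written down as a sketch. The notation below is assumed and not quoted from the paper: it extends the standard semi-infinite formulation of multiple kernel learning by replacing the finite sum over kernels with a Riemann-Stieltjes integral over a compact parameter set $\Omega$. Here $\beta$ is a monotonically increasing function on $\Omega$ (a probability measure), $k(\cdot,\cdot,\omega)$ is a kernel family indexed by $\omega$, and $A$ is the usual feasible set of SVM dual variables with regularization constant $C$.

```latex
\begin{align*}
k_\beta(x_i, x_j) &= \int_\Omega k(x_i, x_j, \omega)\, d\beta(\omega), \\
\max_{\theta,\,\beta}\ \theta \quad \text{s.t.} \quad
  & \int_\Omega \Big( \tfrac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j\, k(x_i, x_j, \omega)
      - \sum_i \alpha_i \Big)\, d\beta(\omega) \;\ge\; \theta
      \quad \text{for all } \alpha \in A, \\
  & \int_\Omega d\beta(\omega) = 1, \\
A &= \Big\{ \alpha \in \mathbb{R}^N : 0 \le \alpha_i \le C,\ \sum_i \alpha_i y_i = 0 \Big\}.
\end{align*}
```

The constraint indexed by all $\alpha \in A$ is the infinite family of constraints with a compact index set, and the normalization of $\beta$ is the additional integral constraint.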

    Learning with infinitely many kernels via semi-infinite programming

    In recent years, learning methods have become desirable because of their reliability and efficiency in real-world problems. We propose a novel method to find infinitely many kernel combinations for learning problems with the help of infinite and semi-infinite optimization regarding all elements in kernel space. This makes it possible to study variations of combinations of kernels when considering heterogeneous data in real-world applications. Looking at all infinitesimally fine convex combinations of the kernels from the infinite kernel set, the margin is maximized subject to an infinite number of constraints with a compact index set and an additional (Riemann-Stieltjes) integral constraint due to the combinations. After a parametrisation in the space of probability measures, the problem becomes semi-infinite. We analyze the conditions which satisfy the Reduction Ansatz and discuss the type of distribution functions of the kernel coefficients within the structure of the constraints and our bilevel optimization problem.

    Analysis of SNP-Complex Disease Association by a Novel Feature Selection Method

    Selecting a subset of SNPs (single nucleotide polymorphisms, pronounced "snips") that is informative and small enough to conduct association studies and to reduce the experimental and analysis overhead has become an important step toward effective disease-gene association studies. In this study, we developed a novel method for selecting informative SNP subsets with greater association with complex disease by making use of machine learning methods. We constructed an integrated system that makes use of major public databases to prioritize SNPs according to their functional effects and finds SNPs that are closely associated with genes which are proven to be associated with a particular complex disease. This helped us gain insight into the complex web of SNP and gene interactions and integrate as much as possible of the molecular-level data relevant to the mechanisms that link genetic variation and disease. We tested the validity and accuracy of the developed model by applying it to a real-life case-control data set and obtained promising results. We hope that the results of this study will support timely diagnosis, personalized treatments, and targeted drug design by facilitating reliable identification of SNPs that are involved in the etiology of complex diseases.